MACS 30100
University of Chicago
## Classes 'tbl_df', 'tbl' and 'data.frame': 714 obs. of 12 variables:
## $ PassengerId: int 1 2 3 4 5 7 8 9 10 11 ...
## $ Survived : int 0 1 1 1 0 0 0 1 1 1 ...
## $ Pclass : int 3 1 3 1 3 1 3 3 2 3 ...
## $ Name : chr "Braund, Mr. Owen Harris" "Cumings, Mrs. John Bradley (Florence Briggs Thayer)" "Heikkinen, Miss. Laina" "Futrelle, Mrs. Jacques Heath (Lily May Peel)" ...
## $ Sex : chr "male" "female" "female" "female" ...
## $ Age : num 22 38 26 35 35 54 2 27 14 4 ...
## $ SibSp : int 1 1 0 1 0 0 3 0 1 1 ...
## $ Parch : int 0 0 0 0 0 0 1 2 0 1 ...
## $ Ticket : chr "A/5 21171" "PC 17599" "STON/O2. 3101282" "113803" ...
## $ Fare : num 7.25 71.28 7.92 53.1 8.05 ...
## $ Cabin : chr "" "C85" "" "C123" ...
## $ Embarked : chr "S" "C" "S" "S" ...
## - attr(*, "na.action")=Class 'omit' Named int [1:177] 6 18 20 27 29 30 32 33 37 43 ...
## .. ..- attr(*, "names")= chr [1:177] "6" "18" "20" "27" ...
| Numeric value | Port |
|---|---|
| 1 | Cherbourg |
| 2 | Queenstown |
| 3 | Southampton |
| Numeric value | Port |
|---|---|
| 1 | Queenstown |
| 2 | Cherbourg |
| 3 | Southampton |
| Numeric value | Port |
|---|---|
| 1 | Southampton |
| 2 | Cherbourg |
| 3 | Queenstown |
Model the probability of \(Y\), rather than modeling \(Y\) directly
\(p(X) = p(\text{survival} = \text{yes} \mid \text{age})\)
\[p(\text{Survival}) = \frac{e^{\beta_0 + \beta_{1}\text{Age}}}{1 + e^{\beta_0 + \beta_{1}\text{Age}}}\]
\[p(\text{Survival}) = \frac{e^{\beta_0 + \beta_{1} \times 30}}{1 + e^{\beta_0 + \beta_{1} \times 30}}\]
\[p(\text{Survival}) = \frac{e^{-0.057 - 0.011 \times 30}}{1 + e^{-0.057 - 0.011 \times 30}}\]
\[p(\text{Survival}) = 0.405\]
## term estimate std.error statistic p.value
## 1 (Intercept) -0.0567 0.17358 -0.327 0.7438
## 2 Age -0.0110 0.00533 -2.057 0.0397
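The worked prediction for a 30-year-old passenger can be checked numerically. The sketch below uses plain Python rather than the R that produced the output in these notes. The helper name `logistic` is an assumption for illustration, and the coefficients are the rounded estimates from the table.

```python
import math


def logistic(eta):
    """Convert a linear predictor (log-odds) into a probability."""
    return math.exp(eta) / (1 + math.exp(eta))


# Rounded estimates from the age-only model
beta_0 = -0.0567  # (Intercept)
beta_1 = -0.0110  # Age

# Predicted probability of survival at age 30
p_age_30 = logistic(beta_0 + beta_1 * 30)
print(round(p_age_30, 3))
```

Because the coefficients are rounded, the result may differ from a value computed with the full-precision estimates in the last digit.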
\[p(\text{Survival}_{30 - 20}) = \frac{e^{\beta_0 + \beta_{1}30}}{1 + e^{\beta_0 + \beta_{1}30}} - \frac{e^{\beta_0 + \beta_{1}20}}{1 + e^{\beta_0 + \beta_{1}20}}\]
\[p(\text{Survival}_{30 - 20}) = \frac{e^{-0.057 - 0.011 \times 30}}{1 + e^{-0.057 - 0.011 \times 30}} - \frac{e^{-0.057 - 0.011 \times 20}}{1 + e^{-0.057 - 0.011 \times 20}}\]
\[p(\text{Survival}_{30 - 20}) = 0.405 - 0.431\]
\[p(\text{Survival}_{30 - 20}) = -0.0267\]
\[p(\text{Survival}_{50 - 40}) = \frac{e^{\beta_0 + \beta_{1}50}}{1 + e^{\beta_0 + \beta_{1}50}} - \frac{e^{\beta_0 + \beta_{1}40}}{1 + e^{\beta_0 + \beta_{1}40}}\]
\[p(\text{Survival}_{50 - 40}) = \frac{e^{-0.057 - 0.011 \times 50}}{1 + e^{-0.057 - 0.011 \times 50}} - \frac{e^{-0.057 - 0.011 \times 40}}{1 + e^{-0.057 - 0.011 \times 40}}\]
\[p(\text{Survival}_{50 - 40}) = 0.353 - 0.379\]
\[p(\text{Survival}_{50 - 40}) = -0.0254\]
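The two first differences show that the same ten-year change in age shifts the predicted probability by different amounts, depending on the starting age. A minimal Python sketch of that comparison follows. The function names are hypothetical, and the rounded coefficients mean the results only approximate the figures above.

```python
import math


def logistic(eta):
    """Convert a linear predictor (log-odds) into a probability."""
    return 1 / (1 + math.exp(-eta))


# Rounded estimates from the age-only model
beta_0, beta_1 = -0.0567, -0.0110


def survival_prob(age):
    """Predicted probability of survival at a given age."""
    return logistic(beta_0 + beta_1 * age)


# Change in predicted probability for two ten-year increases in age
diff_30_20 = survival_prob(30) - survival_prob(20)
diff_50_40 = survival_prob(50) - survival_prob(40)
print(round(diff_30_20, 4), round(diff_50_40, 4))
```

The change in log-odds is identical in both cases (ten times the age coefficient), but the change in probability is not, because the logistic curve is nonlinear.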
\[p(X) = \frac{e^{\beta_0 + \beta_{1}X_1 + \dots + \beta_{p}X_{p}}}{1 + e^{\beta_0 + \beta_{1}X_1 + \dots + \beta_{p}X_{p}}}\]
\[p(\text{Survival}) = \frac{e^{\beta_0 + \beta_{1}\text{Age} + \beta_{2}\text{Sex}}}{1 + e^{\beta_0 + \beta_{1}\text{Age} + \beta_{2}\text{Sex}}}\]
## term estimate std.error statistic p.value
## 1 (Intercept) 1.27727 0.23017 5.55 2.87e-08
## 2 Age -0.00543 0.00631 -0.86 3.90e-01
## 3 Sexmale -2.46592 0.18538 -13.30 2.26e-40
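To see what the age and sex estimates imply, the sketch below computes predicted probabilities for a female and a male passenger of the same age. It is an illustration in plain Python, not the original R workflow. The age of 30 is an arbitrary choice, and the 0/1 coding of `male` is assumed from the `Sexmale` term.

```python
import math


def logistic(eta):
    """Convert a linear predictor (log-odds) into a probability."""
    return 1 / (1 + math.exp(-eta))


# Estimates from the age + sex model
b_intercept = 1.27727
b_age = -0.00543
b_male = -2.46592  # applies when Sex is male (assumed coded as 1)


def survival_prob(age, male):
    """Predicted survival probability; male is 1 for men and 0 for women."""
    return logistic(b_intercept + b_age * age + b_male * male)


p_female_30 = survival_prob(30, male=0)
p_male_30 = survival_prob(30, male=1)
print(round(p_female_30, 3), round(p_male_30, 3))
```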
\[p(X) = \frac{e^{\beta_0 + \beta_{1}X_1 + \beta_{2}X_2}}{1 + e^{\beta_0 + \beta_{1}X_1 + \beta_{2}X_2}}\]
\[p(\text{Survival}) = \frac{e^{\beta_0 + \beta_{1}\text{Age} + \beta_{2}\text{Sex} + \beta_{3} \times \text{Age} \times \text{Sex}}}{1 + e^{\beta_0 + \beta_{1}\text{Age} + \beta_{2}\text{Sex} + \beta_{3} \times \text{Age} \times \text{Sex}}}\]
## term estimate std.error statistic p.value
## 1 (Intercept) 0.5938 0.3103 1.91 0.05569
## 2 Age 0.0197 0.0106 1.86 0.06240
## 3 Sexmale -1.3178 0.4084 -3.23 0.00125
## 4 Age:Sexmale -0.0411 0.0136 -3.03 0.00241
\[p(\text{Survival}_{\text{female}}) = \frac{e^{\beta_0 + \beta_{1}\text{Age} + \beta_{2} \times 0 + \beta_{3} \times \text{Age} \times 0}}{1 + e^{\beta_0 + \beta_{1}\text{Age} + \beta_{2} \times 0 + \beta_{3} \times \text{Age} \times 0}}\]
\[p(\text{Survival}_{\text{female}}) = \frac{e^{\beta_0 + \beta_{1}\text{Age}}}{1 + e^{\beta_0 + \beta_{1}\text{Age}}}\]
\[p(\text{Survival}_{\text{male}}) = \frac{e^{\beta_0 + \beta_{1}\text{Age} + \beta_{2} \times 1 + \beta_{3} \times \text{Age} \times 1}}{1 + e^{\beta_0 + \beta_{1}\text{Age} + \beta_{2} \times 1 + \beta_{3} \times \text{Age} \times 1}}\]
\[p(\text{Survival}_{\text{male}}) = \frac{e^{\beta_0 + \beta_{1}\text{Age} + \beta_{2} + \beta_{3} \times \text{Age}}}{1 + e^{\beta_0 + \beta_{1}\text{Age} + \beta_{2} + \beta_{3} \times \text{Age}}}\]
\[p(\text{Survival}_{\text{male}}) = \frac{e^{(\beta_0 + \beta_{2}) + (\beta_{1} + \beta_{3})\text{Age}}}{1 + e^{(\beta_0 + \beta_{2}) + (\beta_{1} + \beta_{3})\text{Age}}}\]
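With Sex coded 0 for women and 1 for men, the interaction model implies a separate intercept and age slope for each sex: \(\beta_0\) and \(\beta_1\) for women, \(\beta_0 + \beta_2\) and \(\beta_1 + \beta_3\) for men. The Python sketch below plugs in the estimates from the interaction table. The variable names and the ages 20 and 50 are hypothetical choices for illustration.

```python
import math


def logistic(eta):
    """Convert a linear predictor (log-odds) into a probability."""
    return 1 / (1 + math.exp(-eta))


# Estimates from the age x sex interaction model
b0 = 0.5938           # (Intercept)
b_age = 0.0197        # Age
b_male = -1.3178      # Sexmale
b_age_male = -0.0411  # Age:Sexmale


def survival_prob(age, male):
    """Predicted survival probability; male is 1 for men and 0 for women."""
    eta = b0 + b_age * age + b_male * male + b_age_male * age * male
    return logistic(eta)


# Implied intercepts and age slopes on the log-odds scale
female_intercept, female_slope = b0, b_age
male_intercept, male_slope = b0 + b_male, b_age + b_age_male

print(round(female_slope, 4), round(male_slope, 4))
print(round(survival_prob(20, 0), 3), round(survival_prob(50, 0), 3))
print(round(survival_prob(20, 1), 3), round(survival_prob(50, 1), 3))
```

The slopes have opposite signs: under these estimates the predicted probability of survival rises with age for women and falls with age for men.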
\[PRE = \frac{E_1 - E_2}{E_1}\]
\[PRE_{\text{Age}} = \frac{290 - 290}{290}\]
\[PRE_{\text{Age}} = \frac{0}{290}\]
\[PRE_{\text{Age}} = 0\%\]
\[PRE_{\text{Age} \times \text{Gender}} = \frac{290 - 157}{290}\]
\[PRE_{\text{Age} \times \text{Gender}} = \frac{133}{290}\]
\[PRE_{\text{Age} \times \text{Gender}} = 45.9\%\]
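The proportional reduction in error can be computed directly from the two error counts. A short Python sketch follows, using the counts from the equations above. The function name is an assumption for illustration.

```python
def proportional_reduction_in_error(baseline_errors, model_errors):
    """PRE = (E1 - E2) / E1, where E1 is the baseline error count
    and E2 is the error count under the fitted model."""
    return (baseline_errors - model_errors) / baseline_errors


# Error counts taken from the worked equations
pre_age = proportional_reduction_in_error(290, 290)
pre_age_gender = proportional_reduction_in_error(290, 157)

print(f"{pre_age:.1%}", f"{pre_age_gender:.1%}")
```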
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 360 93
## 1 64 197
##
## Accuracy : 0.78
## 95% CI : (0.748, 0.81)
## No Information Rate : 0.594
## P-Value [Acc > NIR] : <2e-16
##
## Kappa : 0.537
## Mcnemar's Test P-Value : 0.0254
##
## Sensitivity : 0.849
## Specificity : 0.679
## Pos Pred Value : 0.795
## Neg Pred Value : 0.755
## Prevalence : 0.594
## Detection Rate : 0.504
## Detection Prevalence : 0.634
## Balanced Accuracy : 0.764
##
## 'Positive' Class : 0
##
Sensitivity/recall
\(TPR = \frac{\text{Number of true positives}}{\text{Number of actual positives}}\)
Specificity
\(TNR = \frac{\text{Number of true negatives}}{\text{Number of actual negatives}}\)
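Sensitivity is the share of actual positives that the model predicts as positive, and specificity is the share of actual negatives that it predicts as negative. The Python sketch below recomputes both, along with accuracy, from the counts in the first confusion matrix. The positive class is 0, as in the reported output, and the variable names are hypothetical.

```python
# Counts from the first confusion matrix (rows: prediction, columns: reference)
# The positive class is 0, matching the reported output.
true_pos = 360   # predicted 0, actually 0
false_pos = 93   # predicted 0, actually 1
false_neg = 64   # predicted 1, actually 0
true_neg = 197   # predicted 1, actually 1

total = true_pos + false_pos + false_neg + true_neg

accuracy = (true_pos + true_neg) / total
sensitivity = true_pos / (true_pos + false_neg)  # true positive rate
specificity = true_neg / (true_neg + false_pos)  # true negative rate

print(round(accuracy, 3), round(sensitivity, 3), round(specificity, 3))
```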
Adjusting threshold
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 413 253
## 1 11 37
##
## Accuracy : 0.63
## 95% CI : (0.594, 0.666)
## No Information Rate : 0.594
## P-Value [Acc > NIR] : 0.0256
##
## Kappa : 0.117
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.974
## Specificity : 0.128
## Pos Pred Value : 0.620
## Neg Pred Value : 0.771
## Prevalence : 0.594
## Detection Rate : 0.578
## Detection Prevalence : 0.933
## Balanced Accuracy : 0.551
##
## 'Positive' Class : 0
##
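Changing the classification threshold changes how many observations land in each predicted class, which is what moves sensitivity and specificity. The notes do not state which cutoff produced the matrix above. The Python sketch below therefore uses hypothetical predicted probabilities and hypothetical cutoffs of 0.5 and 0.8, purely to show the mechanism.

```python
def classify(probabilities, threshold):
    """Predict 1 (survived) when the predicted probability meets the threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]


# Hypothetical predicted survival probabilities, for illustration only
predicted_probs = [0.15, 0.35, 0.55, 0.65, 0.85, 0.95]

default_predictions = classify(predicted_probs, threshold=0.5)
strict_predictions = classify(predicted_probs, threshold=0.8)

# A higher cutoff predicts fewer survivors and more non-survivors
print(sum(default_predictions), sum(strict_predictions))
```

Raising the cutoff for predicting survival pushes more passengers into class 0. That pattern matches the second matrix, where 666 of 714 passengers are predicted as 0, sensitivity for class 0 rises to 0.974, and specificity falls to 0.128.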